Let a co-author C have written J (joint) publications with someone. Rank all the co-authors
of that individual according to their number of joint publications, giving a rank r to each
co-author, starting with r = 1 for the most prolific. Examining a finite set of researchers from a
group working in statistical physics, over a "long" time interval, a very simple relationship
is empirically found between the number of joint publications J by co-authors and their rank of
importance, i.e. $J \propto 1/r$. Thus, in the same spirit as for the Hirsch core, one can define a
"co-author core", and introduce indices operating on an author. Numerical results adapted to the finite
set hereby considered can be meaningfully interpreted. Therefore, variants and generalizations could
be later produced in order to quantify co-author roles in a temporary or long lasting stable team.

PACS numbers: 



I. INTRODUCTION 



In 1926, Lotka discovered that the number of authors $n_r$ is related to the number of published papers r, i.e. [23]

$n_r \sim n_1/r^2$, (1)

where $n_1$ is the number of papers published by the most prolific author. Several other so-called laws have been
predicted or discovered about relations between time, number of publications, number of authors, number of citations,
funds, dissertation production, or the number of journals or scientific books, etc. [7, 11, 20, 30-32, 37, 38].
Scientometrics has become a scientific field in itself ^[SIIHIIH]. Thus, statistical approaches and models based on the 
laws and distributions of Lotka, Pareto, Zipf-Mandelbrot, Bradford, Yule, and others (see Table I for a summary) do
provide much useful information for the analysis of the evolution of scientific systems in which development is closely
connected to a process of idea diffusion and work collaboration.
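
As a minimal illustration of a Lotka-type law, the sketch below tabulates $n_r = n_1/r^2$ for a few values of r. It assumes the inverse-square form usually quoted for Lotka's law; the function name `lotka_counts` and the prefactor value are hypothetical and chosen only for the example.

```python
def lotka_counts(n1: float, r_max: int, exponent: float = 2.0) -> list[float]:
    """Values of n_r = n1 / r**exponent for r = 1..r_max (Lotka-type law)."""
    return [n1 / r**exponent for r in range(1, r_max + 1)]

# Hypothetical prefactor n1 = 100, for illustration only.
counts = lotka_counts(n1=100, r_max=4)
print([round(c, 2) for c in counts])  # [100.0, 25.0, 11.11, 6.25]
```

The exponent is left as a parameter because empirical data sets often deviate from the value 2.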

More recently, an index, the h-index, has been proposed in order to quantify an individual's scientific research
output [17]. A scientist has index h if h of his/her papers have at least h citations each. A priori this h-value
is based on journal articles. However, books, monographs, translations, edited proceedings, etc. can be included
in the measure; whether they are may depend on the precision of the examined database. Needless to say, the best
starting point is the official publication list of an author. However, the number of citations varies according to the
database. It is rather unusual for an author to record the citations of his/her own papers, and some
citations go unnoticed. The notion of core for a paper is also defined, as being a paper which has more than h
citations.
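
The definition of h can be made concrete with a short sketch. The function name `h_index` and the citation counts are hypothetical; only the rule "h papers with at least h citations each" comes from the text.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break
    return h

# Hypothetical citation counts, for illustration only.
example = [25, 8, 5, 4, 3, 1, 0]
print(h_index(example))  # 4
```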

A review focusing on the many variants of the h-index, e.g. the a- and e-indices, their computation and
standardization can be found in ref. [1]. These indices operate at the single-paper level. However, it is often
argued that inconsistencies may arise from self-citations and multi-authored papers. It is clear, without going into
a long discussion, that the role and the impact of co-authors are difficult to measure or even estimate. One may
even ask whether there are too many co-authors pSj. Yet, one is often confronted with such questions. To provide
some perspective, several considerations from the literature are briefly outlined below, without
arguing the pros and cons.

Several disturbing, or controversial, effects of multi-authorship on citation impact, for example, have been shown
in bibliometric studies by Persson et al. in 2004 [55]. However, Glanzel and Thijs [12] have shown that
multi-authorship does not result in any exaggerated extent of self-citations. Moreover, self-citations can indicate some
author creativity, or versatility at changing his/her field of research [2, 14-16].

FIG. 1: log-log plot of the number of (joint) publications of co-authors ranked according to rank importance for the 5 team
members; a few power law lines are indicated; the J ~ 300/r law is given as a guide to the eye

To take into account the effect of multiple co-authorship through the h-index, Hirsch [18] even proposed the $\bar{h}$-index
as being the number of papers of an individual that have a citation count larger than or equal to the $\bar{h}$-index of
all co-authors of each paper. Of course, $\bar{h} \le h$. With the original h-index, a multiple-author paper in general belongs
to the h-core of some of its co-authors and does not belong to the h-core of the remaining co-authors. The $\bar{h}$-index, unlike
the h-index, uniquely characterizes a paper as belonging or not belonging to the $\bar{h}$-core of its authors. However,
these considerations emphasize "papers" rather than "authors". Indeed, one focuses on a paper core, not on a
co-author core.

One may also wonder if co-authors must all have the same "value" in quantifying the "impact" of a paper.
Sekercioglu proposed that the k-th ranked co-author be considered to contribute 1/k as much as the first author [35],
highlighting an earlier proposal by Hagen [13]. At the same time, Schreiber proposed the $h_m$-index [531 [3S],
counting the papers fractionally according to the number of authors; see also Egghe [9], giving an author of an
m-authored paper only a credit of c/m if the paper received c citations. Carbone [B] recently also proposed to give a
weight to each i-th paper of the j-th individual according to the number of co-authors of this i-th paper, the weight
involving a parameter left free at first. Carbone argued that ambiguities in, e.g., the h-index distribution of scientist
populations are resolved if this parameter is ~ 1/2. Other considerations can be summarized: (i) Zhang [39] has argued
against Sekercioglu's hyperbolic weight distribution, as missing the corresponding author, often the research leader. Zhang
proposed that weighted citation numbers, calculated by multiplying regular citations by weight coefficients, remain the same as
regular citations for the first and corresponding authors, who can be identical, but decrease linearly for authors
with increasing rank; (ii) Galam [10] has recently proposed another fractional allocation scheme for contributions to
a paper, imposing, in contrast to Zhang, that the total weight of a paper equals 1, in fine leading to a $g_h$-index
favoring a "more equal" distribution of "co-author's weight" for more frequently quoted papers. Note that it differs
from the $h_g$-index; see [1].

FIG. 2: log-log plot of the number of (joint) publications of co-authors ranked according to importance for the examined team
members and outsiders; (-1) power law lines are indicated for the two most prolific authors, MA and DS. Note the curvature
at low rank
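
Two of the weighting schemes recalled above are simple enough to sketch: the hyperbolic 1/k credit relative to the first author, and the equal fractional credit c/m. The function names are hypothetical, and the normalization of the hyperbolic weights to a unit sum is an assumption added here for convenience, not something stated in the text.

```python
def relative_weights(n_authors: int) -> list[float]:
    """Credit of the k-th ranked co-author relative to the first author: 1/k."""
    return [1.0 / k for k in range(1, n_authors + 1)]

def normalized_weights(n_authors: int) -> list[float]:
    """Same hyperbolic weights, rescaled to sum to 1 (an assumption of this sketch)."""
    weights = relative_weights(n_authors)
    total = sum(weights)
    return [w / total for w in weights]

def fractional_credit(citations: int, n_authors: int) -> float:
    """Equal fractional counting: each of m authors receives c/m citations."""
    return citations / n_authors

print(relative_weights(3))                          # 1, 1/2, 1/3
print([round(w, 3) for w in normalized_weights(3)])
print(fractional_credit(12, 3))                     # 4.0
```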

Other considerations have been given to co-authorship "problems". E.g., Nascimento et al. [2B found that
co-authorship forms a small-world network; from such a point of view, Borner et al. [5] used a weighted graph representation
to illustrate the number of publications and their citations. However, even since Newman [27] or more recently
Mali et al. [23] and the subsequent works recalled above, it seems that there have been considerations on the
number of co-authors and their "rank", for one paper among many others of an individual, but no consideration,
in the sense of Hirsch, about "ranking" over a whole process. The present paper is an attempt to quantify the
importance of co-authors, whence co-workers, in scientific publications over a "long" time interval, and thus to suggest
further investigations about their effect on/in a team and more generally in a scientific career.

The paper reports on two such aspects. First, in Sect. II, an apparently unreported "law" is presented. Examining
a finite set of researchers from a group, well known to the writer, producing papers in
statistical physics, a very simple relationship is empirically found: the number of joint publications J by co-authors
C of a researcher and their rank r are related by $J \propto 1/r$, like the number of publications P and their rank in the
h-index. The tail of the distribution seems unambiguous. A deviation occurs for individuals having few co-authors or a
limited number of publications. Instead of a -1 slope on a log-log plot, one can observe a possible Zipf-Mandelbrot
behaviour at small r, i.e.

$J \propto 1/(\nu + r)^{\zeta}$, (2)

with $\zeta \simeq 1$. Next, in the spirit of Hirsch, one can define the core of co-authors of an individual, not of a paper. In
parallel to the $a_h$- or $e_h$-indices in the publication/citation ranking, one can thus imagine to introduce something
called here the $m_a$- and the $m_a a_a$-indices, in Sect. III. Numerical illustrations are provided. A short conclusion is
found in Sect. IV.

FIG. 3: log-log plot of the number of (joint) publications of co-authors ranked according to importance; the best power law
lines are indicated in each case



II. DATA 



To illustrate the empirical findings about a relationship between the number of co-authors who have written P
publications with someone over some time interval, consider the set of publications produced in statistical mechanics
by a group of researchers connected at some time or another with the SUPRATECS Center of Excellence at the
University of Liege, Liege, Belgium, at the end of the 20th century. Let us consider the set made of 5 authors
(MA, PC, AP, JP, JK) having various scientific careers, ages, expertise or reputations, as given e.g. by their h-index,
with other relevant data given in Table II. The CVs and lists of publications of the team members are available from
the author. Let it be emphasized that the joint publications have covered different time spans.
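
The rank-ordering used throughout can be sketched as follows: from a list of publications, count the joint publications J of each co-author of a given individual and assign ranks r in decreasing order of J. The author tags, the function name `rank_coauthors` and the alphabetical tie-breaking rule are hypothetical choices made for this illustration.

```python
from collections import Counter

def rank_coauthors(publications: list[list[str]], author: str) -> list[tuple[int, str, int]]:
    """Return (rank r, co-author, J) sorted by decreasing number of joint publications J."""
    counts: Counter = Counter()
    for authors in publications:
        if author in authors:
            # Count each co-author once per paper, excluding the author himself/herself.
            counts.update(a for a in set(authors) if a != author)
    # Ties are broken alphabetically (an arbitrary choice of this sketch).
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(r, name, j) for r, (name, j) in enumerate(ordered, start=1)]

# Hypothetical author lists, for illustration only.
papers = [["X", "A", "B"], ["X", "A"], ["X", "A", "C"], ["X", "B"], ["A", "C"]]
ranking = rank_coauthors(papers, "X")
print(ranking)  # [(1, 'A', 3), (2, 'B', 2), (3, 'C', 1)]
```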
FIG. 4: Selected lin-lin plot of the number of (joint) publications of co-authors ranked according to importance; this allows one to
define the core of co-authors for the author through an index $m_a$; values are given in Table II



Two other scientists are included for comparison: (i) TK, a younger researcher having collaborated with the
group, outside statistical mechanics research; (ii) DS, a researcher known as a guru in the field; see the last two columns
of Table II.

When the data is not available from the CV or from the co-authors, it has been taken using the Google Scholar
search engine. Care has been taken about the correctness of the references and citations. For example, JP has a
homonym in another field. The number of citations, leading to the h-index value, includes books when they are
recorded as papers in the search engines, papers deposited on arXiv and papers published in proceedings, be they in
a journal special issue or in a specific book-like form. Also recall that the number of citations till h divided by h is
equal to the a-index. Due to the rather long publication lists of MA and AP, the total citation count till the end of
the examined time interval, i.e., 2010, could not be obtained for them. Note that it is somewhat amazing that for such
a small number of authors, a hyperbolic Lotka-like law is verified with $R^2 \simeq 0.995$, though the exponent is close to
3.0 (graph not shown).

More interestingly, a log-log plot of the number of joint publications between the five team members with either
team or other partners, ranked in a decreasing order of joint contributions, is given in Fig. 1. (N.B. By an abuse of
language, the data points are called freq, for frequency; however, no scaling has been made with respect to the total
number of publications of each author.) It is remarkable that the power law exponent tends to -1, the more so if the number
of publications of the authors becomes large.
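
One simple way to estimate such an exponent is a least-squares fit of log J against log r. The sketch below is only an illustration on synthetic data that follow the J = 300/r guide-to-the-eye law of Fig. 1 exactly; it is not the fitting procedure used for the figures, which the text does not specify, and the function name `loglog_slope` is hypothetical.

```python
import math

def loglog_slope(J: list[float]) -> float:
    """Least-squares slope of log J versus log r, with r = 1, 2, ... the rank."""
    xs = [math.log(r) for r in range(1, len(J) + 1)]
    ys = [math.log(j) for j in J]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data obeying J = 300 / r exactly, for illustration only.
J = [300 / r for r in range(1, 51)]
slope = loglog_slope(J)
print(round(slope, 6))  # -1.0
```

On real data the low-rank points would bend the fit away from -1, which is the curvature discussed in the text.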

Comparison with the "outsiders" of the main team can be made as a test of the scientific field (ir)relevance. A
log-log plot of the number of joint publications versus ranked co-authors, be they partners or not, ranked in a decreasing
order of joint contributions, is given in Fig. 2. It is remarkable that the power law exponent is very close to -1 for the
Lotka-Pareto law
    gives the number distribution/probability of scientists as a function of the number of papers they wrote
    $\alpha \simeq 1$; $n_x = C/(1 + x)^{1+\alpha}$
    Pi-) - ^ m"^"

Yule distribution
    asymptotically corresponds to Lotka law
    $p(x) = \alpha B(x, \alpha + 1)$,
    where $B(x, \alpha + 1) = \Gamma(x)\Gamma(\alpha + 1)/\Gamma(x + \alpha + 1)$,
    i.e., $p(x) \propto \Gamma(\alpha + 1)\alpha/x^{1+\alpha}$

Zipf-Mandelbrot law
    ranks scientists by the number of papers they wrote
    C — ni, one has Xr — ni/{r + a)
    -'^ = (^)
    A=(C/a)i/"; B = C/(afcSiax); 7 = V"-

Bradford law
    reflects the fact that most of the productivity R(n) of relevant articles by scientists is concentrated in a small number n of journals
    Rin)^ni ln(= + 1) .

TABLE I: Bibliometric laws with a few words on their origin and/or usefulness; for more details see Sect. 6.1-6.2 in [37]



most prolific authors, but the curvature at "low rank" indicates that a Zipf-Mandelbrot-like form, Eq. (2), would be
more appropriate. This is a very general feature of almost all Zipf plots. However, another interesting feature concerns
MA, for whom the 1 < r < 5 co-authors have an increasing relevance. One way to interpret this feature can be deduced
from Table II. It can be observed that the tenure year markedly differs for both authors. It can be understood that
DS more quickly had possibilities of collaborating with co-authors than MA, who had to list co-authors of hierarchical
importance on joint publications during a longer time. This feature is similar to that in the analysis of texts when 
articles are a mandatory part of the language j^. 
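
The flattening at low rank can be illustrated with a commonly used Zipf-Mandelbrot parametrization, J(r) = J*/(ν + r)^ζ. The symbols J*, ν and ζ, the function names, and the parameter values below are assumptions of this sketch, not fitted values from the paper. The local log-log slope of this form is -ζ r/(ν + r), which is shallower than -ζ at small r and approaches -ζ at large r.

```python
def zipf_mandelbrot(r: float, j_star: float, nu: float, zeta: float) -> float:
    """Zipf-Mandelbrot form J(r) = j_star / (nu + r)**zeta (parameter names assumed)."""
    return j_star / (nu + r) ** zeta

def local_slope(r: float, nu: float, zeta: float) -> float:
    """Local log-log slope d ln J / d ln r = -zeta * r / (nu + r)."""
    return -zeta * r / (nu + r)

# Hypothetical parameters, chosen only to show the curvature at low rank.
for r in (1, 10, 100):
    print(r, round(zipf_mandelbrot(r, 300.0, 3.0, 1.0), 2), round(local_slope(r, 3.0, 1.0), 3))
```

With ν = 0 the form reduces to a pure power law, for which the local slope equals -ζ at every rank.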

Thereafter, one of the co-authors, JP, has been removed from the plots for clarity; JP has in fact a peculiar
profile, since this researcher has no Ph.D. and has not continued publishing after participating in the team
activities. Therefore, only the best power law fits to the number of publications versus co-author rank, for the
4 main team members and the 2 outsiders, are shown in Fig. 3. The power law is remarkable, with deviations from the -1
slope as explained above.

III. CO-AUTHOR CORE 

Similarly to the definition of the "Hirsch core", along the $\bar{h}$-index, or also the h-index, concept, one can define the
core of co-authors for an author. This value, called $m_a$, is easily obtained from Fig. 4 in the cases so examined through
a simple geometrical construction. Similarly to the a-index, one can define the $m_a a_a$-index, which measures the
surface below the empirical data of the number of publications up to rank $m_a$. In so doing, an $a_a$-index can be defined
in parallel to some a-index. The results are given in Table II.
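
A minimal sketch of these three quantities follows. It assumes that the geometrical construction is the Hirsch-like one, i.e. $m_a$ is the largest rank r such that the r-th ranked co-author has at least r joint publications; this reading, the function name `coauthor_core` and the example data are assumptions, not taken from the paper.

```python
def coauthor_core(J: list[int]) -> tuple[int, int, float]:
    """Return (m_a, area, a_a) for a list of joint-publication counts J.

    m_a  : largest rank r with J(r) >= r (Hirsch-like reading, assumed here)
    area : sum of J over the ranks 1..m_a (the "m_a a_a" surface)
    a_a  : area / m_a
    """
    ranked = sorted(J, reverse=True)
    m_a = 0
    for r, j in enumerate(ranked, start=1):
        if j >= r:
            m_a = r
        else:
            break
    area = sum(ranked[:m_a])
    a_a = area / m_a if m_a else 0.0
    return m_a, area, a_a

# Hypothetical joint-publication counts, not data from Table II.
J = [30, 12, 7, 4, 3, 2, 1, 1]
print(coauthor_core(J))  # (4, 53, 13.25)
```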

The interpretation of such results indicates the relative importance of working in a team or not. It also points
toward further studies on the time of activities. Indeed, compare the $a_a$ values for PC and TK, and observe their relative
scientific career output as co-authors. Be aware that TK is an experimentalist and PC a theoretician, and that they started
their careers at different times. Yet they have a similar record of publications. However, TK, though being associated
                                    MA     PC     AP     JP     JK     TK     DS
born in                             1943   1945   1937   1939   1939   1972   1943
Ph.D. in                            1973   1973   37     none   1973   2001   1970
tenure in                           1986   1976   1980   none   1995   2007   1977
1st publication in                  1971   1974   1966   1983   1967   1997   1967
latest recorded publication in      2010   2010   2010   1983   1999   2010   2010
h-index                             35     11     10     2      10     6      55
a-index                             31.8   26.9   22.4   7      10.    11.    26.
Numb. publications (<2011)          571    34     111    2      60     38     640
Numb. ed. books (<2011)             9             8             (2)           10
Most often cited paper numb. cit.   152    127    37     7      537    41     1430
Tot. numb. citations till h         1113   296    224    14     745    100    8148
Numb. co-authors                    317    32     46     4      38     51     285
Numb. publ. with "best" co-author   155    30     21     2      13     26     30
y . T. .                            1551   95     134    8      108    181    793
skewness                            7.35   4.66   3.18          2.18   3.39   3.98
$m_a$-index                         19     4      7      2      5      6      12
$m_a a_a$-index                     810    46     170    4      39     76     264
$a_a$-index                         42.6   11.5   24.3   2      7.8    12.7   22

TABLE II: Data deduced from CV or Google Scholar on hereby examined scientist set



in a loose way with a team, has almost the same $a_a$ as a stable senior partner, PC (12.7 and 11.5, respectively; see
Table II), even though PC has far fewer co-authors. Similarly, compare $a_a$ for AP and DS, both above 20, even though their
numbers of co-authors are markedly different: a ratio of 6.2, corresponding to an equivalent ratio of publications.
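
The figures quoted here can be cross-checked against Table II. The sketch below recomputes $a_a$ as the ratio of the $m_a a_a$ value to $m_a$ for each scientist, and the DS/AP ratio of co-author numbers; the variable names are hypothetical, while the numbers are those listed in Table II.

```python
# (m_a, m_a a_a) pairs as listed in Table II.
table = {
    "MA": (19, 810), "PC": (4, 46), "AP": (7, 170), "JP": (2, 4),
    "JK": (5, 39), "TK": (6, 76), "DS": (12, 264),
}

# a_a is the surface up to rank m_a divided by m_a.
a_a = {name: round(area / m_a, 1) for name, (m_a, area) in table.items()}
print(a_a)

# Ratio of the numbers of co-authors of DS (285) and AP (46), from Table II.
coauthor_ratio = round(285 / 46, 1)
print(coauthor_ratio)  # 6.2
```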

IV. CONCLUSION 

Two main findings must be outlined as a summary and conclusion. It might have been thought that the number of 
co-authors of papers over a career might be related to the number of joint publications. But it was not obvious that 
a simple relationship should be found. In so finding, an interesting new measure of research teams follows. 

First, examining a finite set of researchers from a well known active group having performed an activity over decades
in statistical physics, a very simple, indeed unexpectedly simple, relationship has been empirically found between the
number of joint publications (J) by co-authors and their rank (r), i.e.

$J \propto 1/r$. (3)

Next, in the same spirit as for the Hirsch core, one can define a "co-author core", and introduce indices, like $m_a$
and $a_a$, operating on an author. Numerical results adapted to the finite set hereby considered can be meaningfully
interpreted. Therefore, variants and generalizations could be later produced in order to quantify co-author roles
in a temporary team. The finite size of the sample is apparently irrelevant as an argument against the findings.
Nevertheless, one could develop the above considerations through a kind of network study.

As a final point, let it be emphasized that even though co-authorship can be abusive [21], it should not be stupidly
scorned. Indeed, in some cases, co-authorship and output are positively related. For instance, it has been shown
that, for economists, more co-authorship is associated with higher quality, greater length, and greater frequency of
publications [19, 33]. Yet bibliometric indicators, such as those nowadays discussed, can be useful parameters to evaluate
the output of scientific research and to give some information on how scientists actually work and collaborate. Of
course, the present findings and the proposed indices are only a few of the possible quantitative ways to tackle the
co-authorship problem. Other methods can be investigated, with variants such as those recalled in the Introduction.
However, they will never be the whole answer to evaluate the career of an individual nor to fund his/her research and
team. But they are easy smoke screens.